8 research outputs found
An attentive neural architecture for joint segmentation and parsing and its application to real estate ads
In processing human produced text using natural language processing (NLP)
techniques, two fundamental subtasks that arise are (i) segmentation of the
plain text into meaningful subunits (e.g., entities), and (ii) dependency
parsing, to establish relations between subunits. In this paper, we develop a
relatively simple and effective neural joint model that performs both
segmentation and dependency parsing together, instead of one after the other as
in most state-of-the-art works. We will focus in particular on the real estate
ad setting, aiming to convert an ad to a structured description, which we name
property tree, comprising the tasks of (1) identifying important entities of a
property (e.g., rooms) from classifieds and (2) structuring them into a tree
format. In this work, we propose a new joint model that is able to tackle the
two tasks simultaneously and construct the property tree by (i) avoiding the
error propagation that would arise from the subtasks one after the other in a
pipelined fashion, and (ii) exploiting the interactions between the subtasks.
For this purpose, we perform an extensive comparative study of the pipeline
methods and the new proposed joint model, reporting an improvement of over
three percentage points in the overall edge F1 score of the property tree.
Also, we propose attention methods, to encourage our model to focus on salient
tokens during the construction of the property tree. Thus we experimentally
demonstrate the usefulness of attentive neural architectures for the proposed
joint model, showcasing a further improvement of two percentage points in edge
F1 score for our application.Comment: Preprint - Accepted for publication in Expert Systems with
Application
Zero-Shot Cross-Lingual Transfer with Meta Learning
Learning what to share between tasks has been a topic of great importance
recently, as strategic sharing of knowledge has been shown to improve
downstream task performance. This is particularly important for multilingual
applications, as most languages in the world are under-resourced. Here, we
consider the setting of training models on multiple different languages at the
same time, when little or no data is available for languages other than
English. We show that this challenging setup can be approached using
meta-learning, where, in addition to training a source language model, another
model learns to select which training instances are the most beneficial to the
first. We experiment using standard supervised, zero-shot cross-lingual, as
well as few-shot cross-lingual settings for different natural language
understanding tasks (natural language inference, question answering). Our
extensive experimental setup demonstrates the consistent effectiveness of
meta-learning for a total of 15 languages. We improve upon the state-of-the-art
for zero-shot and few-shot NLI (on MultiNLI and XNLI) and QA (on the MLQA
dataset). A comprehensive error analysis indicates that the correlation of
typological features between languages can partly explain when parameter
sharing learned via meta-learning is beneficial.Comment: Accepted as long paper in EMNLP2020 main conferenc
Adversarial training for multi-context joint entity and relation extraction
Adversarial training (AT) is a regularization method that can be used to
improve the robustness of neural network methods by adding small perturbations
in the training data. We show how to use AT for the tasks of entity recognition
and relation extraction. In particular, we demonstrate that applying AT to a
general purpose baseline model for jointly extracting entities and relations,
allows improving the state-of-the-art effectiveness on several datasets in
different contexts (i.e., news, biomedical, and real estate data) and for
different languages (English and Dutch).Comment: EMNLP 2018, code is available at
https://github.com/bekou/multihead_joint_entity_relation_extractio
Solving Math Word Problems by Scoring Equations with Recursive Neural Networks
Solving math word problems is a cornerstone task in assessing language
understanding and reasoning capabilities in NLP systems. Recent works use
automatic extraction and ranking of candidate solution equations providing the
answer to math word problems. In this work, we explore novel approaches to
score such candidate solution equations using tree-structured recursive neural
network (Tree-RNN) configurations. The advantage of this Tree-RNN approach over
using more established sequential representations, is that it can naturally
capture the structure of the equations. Our proposed method consists in
transforming the mathematical expression of the equation into an expression
tree. Further, we encode this tree into a Tree-RNN by using different Tree-LSTM
architectures. Experimental results show that our proposed method (i) improves
overall performance with more than 3% accuracy points compared to previous
state-of-the-art, and with over 18% points on a subset of problems that require
more complex reasoning, and (ii) outperforms sequential LSTMs by 4% accuracy
points on such more complex problems